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Abstract 

The Foldy-Wouthuysen iterative diagonalization technique is applied to 
the Helmholtz equation to obtain a Hamiltonian description of the propaga- 
tion of a monochromatic quasiparaxial light beam through a system in which 
the refractive index n(x,y,z) varies about a background value no such that 
\n(x, y,z) — n \ <C n . This technique is presented as an alternative to the 
conventional method of series expansion of the radical. Besides reproducing 
all the traditional quasiparaxial terms, this method leads to additional terms 
in the optical Hamiltonian. 
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1 Introduction 



In the traditional scalar wave theory for treating monochromatic quasiparax- 
ial light beam propagating along the positive z-axis, the z-evolution of the 
optical wave function ip{r) is taken to obey the Schrodinger-like equation 

f X ?m = H^r ) , (!) 

where A = \/2ti = c/uj. The optical Hamiltonian H is formally given by the 
radical 

H = - (n 2 (r) - pi) V2 , (2) 

where p = — i\ V. Then, one expands the square root in a series, with suitable 
ordering, or symmetrization, of the resulting polynomials in the components 
of r± and p ± , if necessary, to get a Hermitian H [l]-[5]. Note that n{r) and 
p ± = — iAVj_ do not commute. Here we shall use the Foldy-Wouthuysen 
(FW) iterative diagonalization procedure, well known in the treatment of 
the Dirac theory of electron [6] to develop an alternative to this technique of 
obtaining the optical Hamiltonian. 



2 The traditional paraxial formalism 

Let the system we are considering have its optic axis along the z-direction 
and let the refractive index of the medium, n(r), vary by a small amount 
around a background, or mean, value n . That is, n(r) = n — e(r), \e(r)\ 
n , and nl — n 2 (r) ss 2n e(r) <^ n^. For a monochromatic quasiparaxial 
beam, with leading ^-dependence ip(r) ~ exp (in z/Xj, we have n . 
This means that p z ~ no, and that all rays propagate almost parallel to the 
positive z-axis. Thus, the expansion of H in the small parameters /no 
and e(r)/n is basically an expansion in l/n . We have 

H = -(n 2 (r)-pi) 1/2 

w -{nl- (pl + 2n e) } 1/2 

= "™o + (77-Pi + e ) +77- (77-Pl + £ 



2n ) 2n \2no 

2 , _\ , / 1 -2 
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If the terms up to first order in l/n only are retained in H, dropping the 
terms proportional to the second and higher powers of 1/uq, one gets the 
paraxial theory applicable to an ideal first order, or linear, system without 
any aberrations, or nonlinearities. To treat a nonlinear or aberrating system, 
to a given order, one has to keep in H the terms proportional to powers of 
l/n up to the desired order of approximation. To treat a given system the 
corresponding Hamiltonian is chosen, up to desired order of accuracy, and 
the integration of the optical Schrodinger equation (1) leads to the required 
transfer (or Green's) function, for the evolution of ip(r) across the system, 
from the input transverse plane to the output transverse plane. 

To arrive at the above formalism one usually starts with the scalar optical 
wave equation 

(V-=^W.')=<>, (4) 



c 



and specializes to the monochromatic case ty(r,t) = ip{r) exp (—iut). Then, 
tp(r) satisfies the Helmholtz equation 



V 2 + ^]^(r) = 0. (5) 



Now, the optical Schrodinger equation (1) follows from rewriting (5) as 

-(\§^\(r) = {n 2 (r)-pi)ii(r), (6) 

and then choosing the 'square root' as 

{iX-^j Hr) = ~ (n 2 (r) -pi) 1/2 ^(r) , (7) 

corresponding to the requirement that the propagation be entirely in the 
positive z-direction; if the propagation is in the negative z-direction, with 
ip(r) ~ exp (— inoz/X^j, the right hand side of (7) will have the opposite sign. 
As seen from (3), to first order in l/n , (7) becomes 



<A-^«(-no + — + (8) 
the paraxial, or the parabolic, approximation to the Helmholtz equation. 
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It should be noted that the passage from (5) to (7) reduces the original 
boundary value problem to a first order initial value problem in z. This 
reduction is of great practical value since it leads to the powerful system or 
the Fourier optic approach [7] However, the above reduction process itself can 
never be claimed to be rigorous or exact. Hence there is room for alternative 
procedures for the reduction. Of course, any such reduction scheme is bound 
to lack in rigor to some extent, and the ultimate justification lies only in the 
results the scheme leads to. The purpose of this note is to explore one such 
possibility based on the analogy between (4) and the Klein-Gordon equation 
for a spin-0 particle. Before beginning this exploration, it is useful to recount 
briefly some of the other attempts to go beyond the paraxial regime. 

3 Beyond the paraxial approximation 

There have been several notable attempts to go beyond the paraxial approx- 
imation and, in that process, to obtain a precise knowledge of the meaning 
and accuracy of paraxial wave optics itself. We highlight here only some of 
these, and the reader may consult these works for further references. 

A significant early attempt in this regard is due to Lax et al. [8]. These 
authors pointed out that the process of neglecting grad div E and seeking a 
solution that is plane polarized in the same sense everywhere is simply in- 
compatible with the exact Maxwell equations. They developed an expansion 
procedure in powers of the beam parameter (w /£), where w is the waist 
size and £ = 2itWq/X is the diffraction length or (twice the) Rayleigh range 
of the beam under consideration. In addition to showing that the zero-order 
field obeyed the Maxwell system of equations, they developed the equations 
obeyed by the higher-order corrections. The first-order correction was shown 
to be longitudinal. 

Agarwal and Pattanayak [9] studied the propagation of Gaussian beam 
in a simple [linear, homogeneous, and isotropic] dielectric using the angu- 
lar spectrum representation for the electric field, and showed that the ex- 
act solution consisted of the paraxial result plus higher-order non-Gaussian 
correction terms. They demonstrated, in particular, that the second-order 
correction term satisfied an equation consistent with the work of Lax et al. 
cited above. In another paper Agarwal and Lax [10] examined the role of 
the boundary condition in respect of the corrections to the paraxial approx- 
imation of Gaussian beams. This work resolved the controversy between the 
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work of Couture and Belanger [11] and that of Agarwal and Pattanayak [9], 
by tracing it to the fact that the two works had used qualitatively different 
boundary conditions for the correction terms: while Agarwal and Pattanayak 
had made the more natural demand that the field distribution in the waist 
plane be strictly Gaussian in the exact treatment, as in the paraxial case, 
Couture and Belanger had demanded the on-axis field to be the same in both 
treatments. 

A major step in directly connecting solutions of the paraxial equation 
to those of the exact treatment was taken by Wiinsche [12], who showed 
that it is possible to construct a linear transition operator which transforms 
arbitrary solutions of the paraxial equation into exact (monochromatic) so- 
lutions of the scalar wave equation (Helmholtz equation). Indeed, Wiinsche 
constructed two such operators 7\, T2 correspondng to two different bound- 
ary conditions and noted, moreover, that the transition operator method is 
equivalent to the complete integration of the system of coupled differential 
equations of Lax et al, restricted to the scalar case. Cao and Deng [13] de- 
rived a simpler transform operator under the condition that the evanescent 
waves can be ignored, and used this transform to study the corrections to the 
paraxial approximation of arbitrary freely propagating beam. They verified 
the consistency of their conclusions with those of the perturbation approach 
of Lax et al. Subsequently, Cao [14] applied the method of Lax el al to 
nonparaxial light beam propagating in a transversely nonuniform refractive 
index medium, computed the correction terms in detail, and specialized the 
results to the case of Gaussian beam propagating in transversely quadratic 
refractive index media. 

The transition operator method has been further extended by Alonso 
et al. [15]. The uncertainty product has played a fundamental role in the 
analysis of paraxial (both coherent and partially coherent) beams. Alonso 
and Forbes [16] have recently generalized the uncertainty product to the case 
of nonparaxial fields. 

There are two aspects to the issue of going beyond paraxial optics. The 
first one is to do with the spatial dependence alone, and hence applies to 
'scalar' optics. The second one is to do with the vectorial nature of the light 
field and, more specifically, with the fact that Maxwell's is a constrained 
system of equations. The restriction div E = (in free space and in homo- 
geneous dielectrics) demands that the spatial dependence of the field, even 
in a transverse plane like the waist plane, cannot be chosen independent of 
polarization. [Thus, an input plane-polarized plane wave going through a 
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thin spherical lens results, in the exit plane of the lens, in a wave which is 
spherical and hence necessarily has spatially-varying polarization so that the 
Poynting vector at all points in the exit plane points to the focus of the lens.] 
Though the work of Lax et al. pointed to both these aspects, the subsequent 
ones noted above largely concentrated on the first. 

Examining the fundamental Poincare symmetry of the Maxwell system 
in the front-form of Dirac, Mukunda et al. [17] developed a formalism which 
converts a solution of the scalar wave equation into a corresponding (ap- 
proximate) solution of the Maxwell system, resulting in a generalization of 
Fourier optics for vector electromagnetic beams [18]. This formalism leads 
to simple-looking electromagnetic Gaussian beams [19] , and predicts a cross- 
polarization term [20] which is consistent with experimental observations [21]. 
Further analysis of electromagnetic Gaussian beams has been presented by 
Sheppard and Saghafi [22], and by Chen et al. [23]. 

We describe below some preliminary results of an ongoing research on the 
use of the FW transformations to study nonparaxial beams and the passage 
through optical systems which are not necessarily paraxial (Gaussian). There 
are two primary reasons for our believing that this approach may have ad- 
vantage over the ones noted above. First, the FW technique iteratively takes 
the field to a new representation where the forward propagating components 
get progressively decoupled from the backward propagating components. Sec- 
ondly, the FW method appears ideally suited for the Lie algebraic approach 
[5]. Finally, the FW technique generalizes to the vector case with very little 
extra effort, as will be shown in a subsequent report. 

4 The Foldy-Wouthuysen formalism 

In the traditional scheme the purpose of expanding the Hamiltonian H = 

1/2 

— (n 2 (r) — p 2 ±j m a series using l/n as the expansion parameter is to 
understand the propagation of the quasiparaxial beam in terms of a series 
of approximations (paraxial + nonparaxial). Let us recall that in relativistic 
quantum mechanics too one has a similar problem of understanding the rela- 
tivistic wave equation as the nonrelativistic approximation plus the relativis- 
tic correction terms in the quasirelativistic regime. For the Dirac equation 
(which is first order in time) this is done using the FW transformation lead- 
ing to an iterative diagonalization technique. For the Klein-Gordon equation 
(which is second order in time) this is done using the same FW technique 
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after linearizing it with respect to time, and thus bringing it to a Dirac-like 
form, following the Feschbach-Villars method [6] . Nonrelativistic Schrodinger 
equation and the Klein-Gordon equation applicable in the cases of ion and 
electron optics, when the spin is disregarded, have also been treated in a 
similar way using the FW technique [24, 25]. The analogy between the op- 
tical wave equation (4) and the Klein-Gordon equation suggests naturally a 
similar technique for treating the scalar wave theory of light beams. Though 
the suggestion to employ the FW technique in the case of the Helmholtz 
equation exists in the literature as a remark [26] it has not so far been ex- 
ploited to analyze the quasiparaxial approximations for any specific beam 
optical system. 

Written as a first order system, the Helmholtz equation reads 



iX 



d_ 

dz 





n 2 (r)-p\ 



1 




fX^(r) 



(9) 



Now, let us define 



(i) 
+ 
(i) 



t raj) oz T 



(10) 



The effect of the transformation in (10) for a quasiparaxial beam moving 
in the forward z-direction is to separate the component which is 'fast' in z 
from the one which is 'slow' : there exists one linear combination of tp and 
dip/dz which varies rapidly in z and another which varies slowly in z, and 
the above transformation picks precisely these components. In our case of 
forward propagation, since ip(r) ~ exp (irioz/Xj, we have \f r +' ) ~ V ; ( r ) an d 



^> . In other words, \E r Y ; and are, respectively, the large and 
the small components of Vt^ 1 ). Let us now rewrite the Helmholtz equation 

(5), or (9), as 



(i) 



zX^— = & 1 W 1 \ 

oz 



(11) 



with 



H (D 



-n a z + + , 
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V2n J 



(12) 



where cr^ and cr 2 are, respectively, the y and z components of the triplet of 
Pauli matrices, cr, namely, 



(5 T. 



' 


1 " 




' -i ' 




' 1 


" 


1 







i 


, c 2 = 





-1 



(13) 



This form of the Helmholtz equation is analogous to the Feschbach-Villars 
form of the Klein-Gordon equation. Note that equation (11) thus derived is 
algebraically very similar to the Dirac equation. Like in the Dirac equation, 
FT 1 ) is the sum of a leading diagonal term —n a z (analogous to mc 2 (3), a di- 
agonal 'even' term £^ which does not couple the large and the small compo- 
nents of S^ 1 ), and the off-diagonal 'odd' term which couples them. The 
even term £W commutes with a z and the odd term anticommutes with 
a z :£^a z = cr z £^ and (D^a z = —a z (D^ l \ This perfect analogy between the 
Dirac equation and (11) enables us to use the standard FW transformation 
technique to analyze (11) in terms of paraxial and higher order expansions 
with l/n as the expansion parameter. This technique works as follows. 

In FT 1 ) the off-diagonal odd term is small compared to the diagonal part 
—no<J z + £W in which the leading term is of order no- A series of successive 
transformations of ^f^, following a fixed recipe described below, is applied 
to (11) such that after each transformation the off-diagonal odd term of the 
resulting equation becomes weaker, being of higher order in l/n . If at any 
stage the odd term is considered weak enough to be neglected then one can 
approximate the corresponding iterated Hamiltonian by keeping only its di- 
agonal part. Thus, this iterative diagonalization scheme can be carried out 
systematically up to any desired order in the expansion parameter l/n . It 
is interesting to note that (11) corresponds already to the paraxial approxi- 
mation (8) of the Helmholtz equation if the odd term is dropped from 
FT 1 ), retaining only the diagonal part — n§o z + £^\ 

The first FW transformation is 



*( 2 ) = exp (-^a (1) /2n ) 
1 / 1 



exp 



2no \2n 



-p ± + e a 



* (1) . 



(14) 
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The result of this transformation is to turn (11) into 

.0^(2) 



iX- 



dz 



(15) 



with 

H< 2 > = 

i<n = 

0(2) = 



-n a z + # a > + 6® , 



+ iX 



dz 



+ 



2nn 



dz J 2n { 



-<y. 



6(i)^(D] 



+ 



(16) 



Note that in IT 1 ) the odd term is of order (l/n ). In H( 2 ) the odd 
term is of order (1/no) 2 . Hence in (15) the odd part is weaker than the 
diagonal part, by one more order, compared to (11). Further, note that the 
basic algebraic structure, £^a z = a z £^ and <D^a z = —a z 0^ 2 \ is preserved 
by the iteration. 

The second FW transformation is 



*( 3 ) = exp (-a z O^/2n ) *< 2 > . 
The result of this transformation is to turn (15) into 

.0^(3) 



i\- 



dz 



with 



R^ = -n a z +£^ + O^, 
where £^ and are obtained by the replacements 

£(3) = 5(2)^(1)^5(2)^(1)^^(2) 
0(3) = 0(2)^(1)^5(2)^(1)^0(2)^ 



(17) 



(18) 



(19) 



(20) 



This happens because the expressions in (16) follow as a result of the general 
algebraic properties of the even and odd operators, independent of the spe- 
cific expressions for £^ and given in (12). Further, £^a z = <r z i<® and 
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dm (7z = -a z 6^\ and 6^ i s of order (l/n ) 3 . Thus the odd part of is 
weaker than the odd part of IF 2 ) by one more order. Now, it is straightfor- 
ward to see how this sequence of FW transformations can be continued up 
to any desired stage, making the odd part of the resulting weaker by one 
more order at each stage. 

If the FW transform process is stopped at, say, the j-th stage then one 
would have arrived at 

iX^— = H^* (i) , (21) 

dz 



with 



/ vl>0') \ 



(22) 



where is of order (l/n ) 3 . It is important to note that each FW trans- 
formation preserves and improves the property that the upper component 
of Vl/^ is large compared to its lower component: ^> . In view of 

this, we can drop the odd part O^) from as negligible compared to the 
diagonal part — n a z + and write 

. x ^r)^^ W = -n + ^g ) , (23) 

az 

where is the 11 matrix element of 8^ and has been simply relabeled 
■ip. Equation (23) is the j-th order approximation to the Helmholtz equation 
in this approach. 

As already noted the first order approximation corresponds to the usual 
paraxial theory. In this case, equation (23) becomes 

H = -n + 8^ = -n + ^-p\ + e . (24) 

2n 

Let us now look at the second order approximation. From (12), (15), (16), 
and (23), we have, keeping terms up to order (1/no) 5 , 
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H 



-n + S{? 



-n + 
1 



1 -2 \ 
2n~ P± + £ ) + 16nl 



Pi 



dz 



2hq \2n 



1 



-n + 



2ru 



-Pl + e + 



-P± + e 



1 



2n,Q \2n 



1 



-Pl + e + 



A 2 g 
16no <9z 



(p ± • V ± e + V ± e ■ p ± ) + . . . 



(25) 



Comparing (25) with the traditional expansion in (3) it is clear that Ti has 
the first few terms of the traditional Hamiltonian H correctly, plus an extra 
term (the commutator term). To get the higher order terms of H in 7i with 
the correct coefficients one will have to consider approximations beyond the 
second order. We assume, for consistency, that the derivatives of e{r) are 
also small compared to tiq. 

One may note that the FW scheme automatically leads to a Hermitian 
Hamiltonian without need for any further ordering of its noncommuting com- 
ponents. 



5 Concluding remarks 



It is interesting that the extra commutator term 



16n;t 



P± ' dz 



in (25) con- 



tributes a correction to the optical Hamiltonian, even at the 'paraxial level', 
when the refractive index of the medium suffers both longitudinal and trans- 
verse inhomogeneities. Such a ^-derivative term is not natural to the tradi- 
tional power series expansion. This commutator term is what survives in the 



commutator term — 



< s «r, 



+ f\^P\ in the expression for 8^ 



in (16). In the Foldy-Wouthuysen formalism of the Dirac theory the corre- 
sponding commutator term is responsible for the correct explanation of the 
spin-orbit energy (including the Thomas precession effect) and the Darwin 
term (attributed to the zitterbewegung) (see Sectioin 4.3 of [6]). Similarly 
in the nonrelativistic reduction and interpretation of the Klein-Gordon equa- 
tion using the Foldy-Wouthysen transformation theory such a commutator 
term corresponds to the Darwin term correcting the classical electrostatic 
interation of a point charge in analogy to the zitterbewegung of the Dirac 
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electron (see Section 9.7 of [6]). In the quantum theory of beam optics of 
charged Klein-Gordon and Dirac particles [24, 25] the corresponding terms 
add to the Hamiltonian the lowest order quantum corrections to the classical 
aberration terms. In view of this analogy it should be of interest to study 

P-L' dz 



the effect of this correction term to the optical Hamiltonian, ~ 2 



16ng 



on the propagation of Gaussian beams in a parabolic index medium whose 
focusing strength is modulated in the axial variable z, however tiny it may 
be compared to the classical terms. 

Other questions naturally suggested by the above preliminary report are: 
the issue of convergence in respect of the series expansion resulting from the 
FW method and the boundary condition the series satisfies in relation to the 
paraxial result in any special plane like the waist plane of the beam. The 
precise relation between the FW series and the results of the other approaches 
recounted in Section 3 are not immediately clear at the moment, but it is 
an important issue worth investigating. We hope to return to these issues 
elsewhere. 

The authors would like to thank the Referees for some insightful com- 
ments leading to improved clarity of presentation. 
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